Parsimonious Topic Models with Salient Word Discovery

Similar resources

Statistical word sense aware topic models

LDA has proven effective in modeling the semantic relation between surface words. This semantic information in the document collection is useful for measuring the topic distribution of a document. In general, a surface word may contribute significantly to several topics in a document collection. LDA measures the contribution of a surface word to each topic and considers a surface word to be ...

Improving Topic Models with Latent Feature Word Representations

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two different Dirichlet multinomial topic models by incorporating latent feature vector representations of words trained on very large corpora to improve the word-t...

Gaussian LDA for Topic Models with Word Embeddings

Continuous space word embeddings learned from large, unstructured corpora have been shown to be effective at capturing semantic regularities in language. In this paper we replace LDA’s parameterization of “topics” as categorical distributions over opaque word types with multivariate Gaussian distributions on the embedding space. This encourages the model to group words that are a priori known t...

Topic Modelling with Word Embeddings

This work evaluates and compares two frameworks for the unsupervised topic modelling of the CompWHoB Corpus, our political-linguistic dataset. The first approach applies Latent Dirichlet Allocation (henceforth LDA), whose evaluation serves as the baseline for comparison. The second framework employs Word2Vec technique...

Inhomogeneous Parsimonious Markov Models

We introduce inhomogeneous parsimonious Markov models for modeling statistical patterns in discrete sequences. These models are based on parsimonious context trees, which are a generalization of context trees, and thus generalize variable order Markov models. We follow a Bayesian approach, consisting of structure and parameter learning. Structure learning is a challenging problem due to an over...

Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2015

ISSN: 1041-4347

DOI: 10.1109/tkde.2014.2345378